Presentation to MSU Department of Psychology, Program Evaluation Occasional Speaker Series, East Lansing, MI
2024-12-05
Missing data (MD) are measurements you want or intended to collect but did not get.[1]
Data collection doesn’t always go according to plan…
| Human Factors | Other Factors |
|---|---|
| Participant behavior | Equipment failures |
| Evaluator errors | Records/Databases |
| Partner behavior | Unusual Events |
Handling missing data well enacts our guiding principles[2]:
There are 3 major scientific activities that can be affected by missing data.
A representative sample is crucial to generalizing to the intended population!
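How much data is there? Data volume is \(N_{values} = P \times V \times T\), where \(P\) is the number of participants, \(V\) the number of variables, and \(T\) the number of time points.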
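As a quick check on the formula, the sketch below computes the data volume for a design and the percentage of intended values that are missing. It reads P as participants, V as variables, and T as time points, an interpretation inferred from the growth study dimensions quoted in this talk. The function names `data_volume` and `percent_missing` are hypothetical helpers, not part of any package.

```python
def data_volume(participants: int, variables: int, time_points: int) -> int:
    """Total number of values the design intends to collect (P x V x T)."""
    return participants * variables * time_points


def percent_missing(n_missing: int, n_values: int) -> float:
    """Share of intended values that were not obtained, as a percentage."""
    return 100.0 * n_missing / n_values


# Dimensions quoted in this talk for the Dutch boys growth study data.
n_values = data_volume(participants=748, variables=9, time_points=1)
print(n_values)  # 6732
```

The same denominator can then be reused when reporting the percentage of missing values overall.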
Report numbers & percentages of:
Use a Good Tracking System
Track attendance at data collection events & participants’ exit from the study.
Tip
We can aggregate and visualize R to describe patterns of missingness!
Missingness patterns for Dutch boys growth study data (748 boys, 9 variables, 1 time point)[7]
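A minimal sketch of that aggregation, assuming R denotes the response indicator matrix (1 = observed, 0 = missing), which is common notation in the missing data literature. The rows below are made up for illustration and are not the growth study data.

```python
from collections import Counter

# Tiny made-up dataset: rows are participants, None marks a missing value.
columns = ["age", "hgt", "wgt"]
rows = [
    [10.2, 140.1, 35.0],
    [11.5, None, 40.2],
    [9.8, 135.0, None],
    [12.1, None, 44.9],
    [10.9, 142.3, 37.5],
]

# R has the same shape as the data: 1 = observed, 0 = missing.
R = [[0 if value is None else 1 for value in row] for row in rows]

# Aggregate by column: how many values are missing per variable?
missing_per_variable = {
    name: sum(1 - r[j] for r in R) for j, name in enumerate(columns)
}

# Aggregate by row: how many participants share each missingness pattern?
pattern_counts = Counter(tuple(r) for r in R)

print(missing_per_variable)
for pattern, count in pattern_counts.most_common():
    print(pattern, count)
```

Column totals show which variables lose the most data. Pattern counts show which combinations of missing variables occur together.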
Impact on Statistical Results[3]
Some mechanisms yield more bias: MCAR < MAR < MNAR
MCAR is when neither observed nor unobserved values predict which values are missing.
MAR is when observed values predict which values are missing.
MNAR is when unobserved values predict which values are missing.
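The three definitions can be illustrated with a small simulation. This is a sketch under stated assumptions, not a general demonstration: the variables, the 0.7 slope, and the cut-offs at 0 are all invented for illustration. Here `x` is always observed and `y` is the variable that goes missing.

```python
import random
import statistics

random.seed(2024)
n = 20_000

# x is always observed; y is correlated with x and has true mean 0, variance 1.
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.7 * xi + random.gauss(0, 0.51 ** 0.5) for xi in x]

# MCAR: a coin flip decides missingness, unrelated to x or y.
mcar_observed = [yi for yi in y if random.random() < 0.5]

# MAR: y is missing whenever the observed covariate x is above 0.
mar_observed = [yi for xi, yi in zip(x, y) if xi <= 0]

# MNAR: y is missing whenever y itself (unobserved when missing) is above 0.
mnar_observed = [yi for yi in y if yi <= 0]

for label, values in [
    ("MCAR", mcar_observed),
    ("MAR", mar_observed),
    ("MNAR", mnar_observed),
]:
    print(label, round(statistics.mean(values), 2))
```

With these settings the observed-case mean under MCAR should stay close to the true mean of 0, while the MAR and MNAR means should fall below it. Under MAR the bias can in principle be corrected using `x`; under MNAR the observed data alone cannot reveal it.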
Classifying large datasets according to Rubin’s mechanisms is messy.
Tip
Consider predictors of person, item, construct, and person-period missingness. Think carefully about your study context and data to look for meaningful, sensible things to test when evaluating missing data issues.
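One simple way to look for a predictor of person-level missingness is to compare missingness rates across levels of an observed variable. The sketch below uses made-up records and a hypothetical `site` variable. A large gap between groups would suggest the variable predicts missingness and deserves a formal test.

```python
from collections import defaultdict

# Made-up records: (data collection site, whether the outcome is missing).
records = [
    ("clinic", False), ("clinic", False), ("clinic", True), ("clinic", False),
    ("school", True), ("school", True), ("school", False), ("school", True),
]

counts = defaultdict(lambda: [0, 0])  # site -> [n_missing, n_total]
for site, is_missing in records:
    counts[site][0] += int(is_missing)
    counts[site][1] += 1

missing_rate = {
    site: n_miss / n_total for site, (n_miss, n_total) in counts.items()
}
print(missing_rate)  # {'clinic': 0.25, 'school': 0.75}
```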
“An ounce of prevention is worth a pound of cure.” (Benjamin Franklin, 1735)
Every variable is an opportunity for missing data.
| Data Deletion | Single Imputation |
|---|---|
| Listwise deletion | Mean Substitution |
| Pairwise deletion | Hot-Deck Imputation |
| Available Items Analysis | Regression Imputation |
| | Last Observation Carried Forward |
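To make three of the methods in the table concrete, here is a minimal sketch using made-up numbers. It shows listwise deletion, mean substitution, and last observation carried forward on toy lists where `None` marks a missing value.

```python
import statistics

# Made-up scores on one variable across six participants; None = missing.
scores = [4.0, None, 6.0, 8.0, None, 10.0]

# Listwise deletion: drop every case with a missing value.
listwise = [s for s in scores if s is not None]

# Mean substitution: replace each missing value with the observed mean.
observed_mean = statistics.mean(listwise)
mean_substituted = [observed_mean if s is None else s for s in scores]

# Mean substitution keeps the mean but shrinks the variance.
print(statistics.variance(listwise), statistics.variance(mean_substituted))

# Last observation carried forward: one participant's scores over four waves.
waves = [3.0, 5.0, None, None]
locf = []
last_seen = None
for value in waves:
    if value is not None:
        last_seen = value
    locf.append(last_seen)

print(locf)  # [3.0, 5.0, 5.0, 5.0]
```

The variance comparison shows one reason single imputation can mislead: filled-in values are treated as if they were real observations, so variability is understated.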
FIML estimates parameters by combining observed data, relationships among observed variables, and assumptions about distributions.
MI replaces each missing value with m independent draws from the variable’s conditional distribution.
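The idea can be sketched with simulated data and a single regression-based imputation model. This is a simplified illustration: it draws only residual noise around a fitted line with fixed coefficients, so it understates uncertainty relative to proper MI, which would also draw the regression parameters. All variable names and simulation settings are assumptions made for the example. The m estimates are pooled with Rubin's rules.

```python
import random
import statistics

random.seed(7)
n, m = 2000, 20

# Simulated data: x fully observed, y missing whenever x > 0.5 (a MAR setup).
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]
y_obs = [None if xi > 0.5 else yi for xi, yi in zip(x, y)]

# Fit y ~ x by least squares on the complete cases.
cc = [(xi, yi) for xi, yi in zip(x, y_obs) if yi is not None]
mean_x = statistics.mean([xi for xi, _ in cc])
mean_y = statistics.mean([yi for _, yi in cc])
sxx = sum((xi - mean_x) ** 2 for xi, _ in cc)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in cc)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
resid_sd = (
    sum((yi - intercept - slope * xi) ** 2 for xi, yi in cc) / (len(cc) - 2)
) ** 0.5

# Create m completed datasets; each missing y gets a fresh random draw.
estimates, within_vars = [], []
for _ in range(m):
    completed = [
        intercept + slope * xi + random.gauss(0, resid_sd) if yi is None else yi
        for xi, yi in zip(x, y_obs)
    ]
    estimates.append(statistics.mean(completed))
    within_vars.append(statistics.variance(completed) / n)

# Pool with Rubin's rules: total = within + (1 + 1/m) * between.
pooled_mean = statistics.mean(estimates)
within = statistics.mean(within_vars)
between = statistics.variance(estimates)
total_var = within + (1 + 1 / m) * between

print("complete-case mean:", round(mean_y, 2))
print("pooled MI mean:", round(pooled_mean, 2), "SE:", round(total_var ** 0.5, 3))
```

In this setup the complete-case mean of y should fall well below the simulated true mean of 2.0, while the pooled estimate should land close to it. The between-imputation term is what carries the extra uncertainty due to the missing values.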
MVA and MULTIPLE IMPUTATION syntax commands